ABSTRACT


This study aims to identify the most suitable of five candidate models for
describing the diameter-height and diameter-volume relationships of individual
Shorea robusta trees in the forests of Putalibazar Municipality, Syangja. The
diameter at breast height and total height of 137 individual trees were measured,
and their volumes were calculated. The normality of these variables was assessed
with a non-parametric Kolmogorov-Smirnov test (p < 0.05), which revealed
non-normality; five candidate models were therefore fitted to the height-diameter
and volume-diameter relationships after transformation of variables. The study
estimated model parameters, including intercepts and regression coefficients, and
assessed model performance using fit statistics such as the adjusted coefficient of
determination (adj. R2) and root mean square error (RMSE). The statistical
significance of the regression parameters was determined through parametric
t-tests, and all parameters were found to be statistically significant (p < 0.05).
The best-fitting model was selected as the one with the highest adjusted
coefficient of determination, the lowest root mean square error, and the lowest
Akaike Information Criterion (AIC) value. Visual assessments, including
histograms of residuals, normal probability plots, and scatter plots, were also
employed. Among the models tested, the Wykoff model,
H = Bh + exp(3.19 + (-9.203)/(D + 1)), where H is total height (m), Bh is breast
height (m), and D is diameter at breast height (cm), performed best for the
diameter-height relationship. For the diameter-total volume relationship, the
model V = (-0.049) + 0.001 * D2 proved to be optimal. These selected models are
recommended for predicting the height and volume of individual Shorea robusta
trees. These models are explicitly site-specific and should be applied only to
sites, sizes, and stand conditions congruent with those examined in this study.


Keywords: Model, best-fit, RMSE, adjusted R2, etc. 




ANALYSIS ARTICLE OPEN ACCESS 


1. INTRODUCTION 


Background 

Tree diameter distribution is one of the most important aspects for forest managers to consider when making decisions about
the management of forest stands, as it provides a wide range of information, from timber assortments to carbon stock and even
forest biodiversity. Two key measurements in forest inventories are the height of trees (H) and their diameter at breast height (D).
These variables are needed to estimate quantities such as how much space the trees occupy, how much they weigh,
how much carbon they store, and how likely they are to survive (Curtis, 1967). Effective forest management at the local, regional,
or national level requires a good understanding of the size and composition of the forest (Rahman et al., 2022). With this
knowledge, forest managers can develop strategies that ensure the forest ecosystem continues to thrive.

In Nepal, the Shorea robusta tree is particularly valuable. It is used for construction and furniture making, and it is the primary
source of firewood in the Terai region. Shorea robusta leaves are also important as animal fodder and for making disposable plates
(Jackson, 1994). The allometric equation is a valuable tool for establishing a connection between easily measurable morphometric
variables, such as tree diameter (D), and the overall height and volume of a tree. Traditionally, models describing the relationship
between tree height (H) and diameter (D) and between tree volume and diameter have been developed and applied primarily in
pure, even-aged forest stands or plantations. In these models, diameter (D) serves as a predictor variable for tree height (H) in the
H-D model (Huang et al., 2000). More recent studies have expanded the scope by considering additional stand attributes, including
site quality, stand age, stand density, and dominant height, in mixed-effect H-D relationship models (Castaño-Santamaría et al.,
2013).

Mixed-effect H-D relationship models incorporate both population-averaged parameters (fixed parameters common to the
population) and subject-specific effects as random effects. This approach enhances accuracy compared to nonlinear models that rely
on minimizing sums of squares. The distribution of tree diameters holds exceptional importance for forest managers when making
decisions regarding forest stand management. It furnishes a wealth of information, ranging from the types of timber available to the
estimation of carbon storage and the preservation of forest biodiversity (Pradip-Saud et al., 2016). Therefore, this research aims to
develop best-fitted height-diameter and diameter-volume models for the Shorea robusta forest of Syangja district in western Nepal.
The developed models may be recommended to forest managers for predicting the total height and volume of S. robusta trees in
western Nepal and may serve as a reference for the rest of Nepal. It is expected that the proposed models will be a useful tool for
forest managers, forest users, and researchers.


Limitations of the study 

The relationship between a tree's diameter, height, and volume is subject to variation depending on various environmental factors, 
site quality, stand density, stand age, competition, and silvicultural treatments applied, among others (Forrester, 2017). Numerous 
studies have indicated that in dense forests, trees tend to grow taller compared to less dense forests, assuming other factors remain 
constant. Conversely, trees in dense forest environments tend to have smaller diameters, primarily due to heightened competition 
(Lopez-Sanchez et al., 2003; Calama and Montero, 2004). Additionally, it's worth noting that the limited duration of data collection 
resulted in the analysis of data from a relatively small sample of 137 trees. 


2. METHODOLOGY 

Study Site 

The research was conducted in the Shorea robusta forest of Putalibazar Municipality, Syangja, Nepal. The principal tree species
include Shorea robusta (Sal), Syzygium cumini (Jamun), Adina cordifolia (Karma), Lagerstroemia parviflora (Botdhayero), Terminalia
bellirica (Barro), and Castanopsis indica (Katus), among others, and slopes range from 0 to 48 degrees (Figure 1). The research area
is located between 83.847° and 83.96° E and between 28.07° and 28.119° N.

Figure 1 Study area


Sampling Design and Data Collection 




Forest management plans were used to determine the size variation of the individual species (S. robusta) within the research region.
To cover a wide range of altitudes, aspects, slopes, stand origins, stand densities, stand ages and size classes, and stand
treatments, stands were selected subjectively rather than through a complex sampling technique in which the selected individuals
would have adequately represented the entire population (Adinugroho and Sidiyasa, 2006; Diéguez-Aranda et al., 2006;
Castedo-Dorado et al., 2006). Wolf trees and malformed, top-broken, inhibited, leaning, or unhealthy trees were not measured
(Sharma, 2009). One hundred and thirty-seven (137) trees were measured. The diameter at breast height (DBH) and the total height
of the trees were measured using a diameter tape and a Vertex IV, respectively.


Data Analysis 
Diameter-Height Relationship

The normality of these variables was assessed using a non-parametric Kolmogorov-Smirnov test (p < 0.05), which revealed
non-normality. Five candidate models (Table 1), one linear and four nonlinear, were therefore fitted to the height-diameter
relationship after transformation of variables. All of these models have a small number of parameters and are theoretically sound;
thus, they are frequently employed to describe different tree and stand characteristics.
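
As a hedged illustration of the normality check described above, the following Python sketch computes a one-sample Kolmogorov-Smirnov statistic against a normal distribution whose mean and standard deviation are estimated from the sample. It is not the SPSS procedure used in this study: the function names and the demonstration samples are hypothetical, and the p-value comes from the asymptotic Kolmogorov series, which is only approximate when the distribution parameters are estimated from the same data.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(sample):
    """K-S distance between the sample and a normal fitted to its mean and SD."""
    ordered = sorted(sample)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    d_stat = 0.0
    for i, value in enumerate(ordered, start=1):
        cdf = fitted.cdf(value)
        # The empirical CDF jumps from (i - 1)/n to i/n at this observation.
        d_stat = max(d_stat, i / n - cdf, cdf - (i - 1) / n)
    return d_stat


def ks_pvalue_asymptotic(d_stat, n, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution."""
    lam = math.sqrt(n) * d_stat
    if lam < 0.2:  # the p-value is numerically 1 for such small distances
        return 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                 for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * series))


# Hypothetical samples: evenly spaced normal quantiles and a right-skewed version.
n = 137
near_normal = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
skewed = [math.exp(z) for z in near_normal]

for label, data in (("near-normal", near_normal), ("skewed", skewed)):
    d = ks_statistic_normal(data)
    print(label, round(d, 3), ks_pvalue_asymptotic(d, len(data)) < 0.05)
```

A p-value below 0.05 is read here, as in the text, as evidence of non-normality.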


Table 1 Candidate models fitted to the height-diameter relationship.

Model | Equation | Model function
M1 | H = Bh + a + bD + εi | General linear
M2 | H = Bh + exp(a + b/(D + 1)) + εi | Wykoff
M3 | H = Bh + aD^b + εi | Lundqvist/Korf
M4 | H = Bh + a exp(-b/D) + εi | Ratkowsky
M5 | H = Bh + D^2/(a + bD)^2 + εi | Näslund

Where H = total height (m); D = DBH = diameter at breast height (cm); Bh = breast height (m); a, b = regression coefficients
(parameters); exp = exponential function; εi = unexplained (random) error, i.e., the residual.
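
The Wykoff form (M2) becomes linear after a transformation of variables: ln(H - Bh) = a + b/(D + 1). The Python sketch below fits a and b by ordinary least squares on the transformed variables. It is a minimal illustration, not the SPSS workflow of this study: the breast-height value of 1.3 m, the function names, and the demonstration data are all assumptions.

```python
import math

BH = 1.3  # assumed breast height in metres; the source does not state the value


def fit_wykoff(dbh_cm, height_m, bh=BH):
    """Fit H = bh + exp(a + b/(D + 1)) by OLS on ln(H - bh) against 1/(D + 1)."""
    x = [1.0 / (d + 1.0) for d in dbh_cm]
    y = [math.log(h - bh) for h in height_m]  # requires every height > bh
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_wykoff(dbh_cm, a, b, bh=BH):
    """Back-transform the linear predictor to a height in metres."""
    return bh + math.exp(a + b / (dbh_cm + 1.0))


# Hypothetical data generated from known parameters, so the fit should recover them.
true_a, true_b = 3.0, -9.0
dbh = [8.0, 15.0, 22.0, 30.0, 41.0, 55.0]
height = [predict_wykoff(d, true_a, true_b) for d in dbh]

a_hat, b_hat = fit_wykoff(dbh, height)
print(round(a_hat, 3), round(b_hat, 3))
```

Fitting on the log scale minimizes errors in ln(H - Bh) rather than in H itself, so the estimates can differ from those of a direct nonlinear least-squares fit.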


Diameter-Total Volume Relationship

The normality of these variables was assessed by a non-parametric Kolmogorov-Smirnov test (p < 0.05), which revealed
non-normality; five candidate models (M1 to M5 below) were therefore fitted to the volume-diameter relationship after
transformation of variables. All of these models have a small number of parameters and are mathematically sound, so they are
commonly used to model various tree and stand characteristics.

The following models were tested using SPSS:

\[
\begin{aligned}
V &= a + b\,D && \text{(M1)} \\
\ln V &= a + b\,\ln D && \text{(M2)} \\
V &= a + b\,\ln D && \text{(M3)} \\
\ln V &= a + b\,D && \text{(M4)} \\
V &= a + b\,D^{2} && \text{(M5)}
\end{aligned}
\]


Parameter estimation and model evaluation: This study followed the commonly used two-step modeling approach. The
first step is to fit the candidate models, and the second step is to evaluate the models that have been fitted. In the first step,
regression analysis was used to fit candidate models M1 to M5.

The second step, i.e., the evaluation of the fitted models, was carried out using the following criteria: 

Adjusted coefficient of determination (R2 adj): Adjusted for the number of parameters, p, and the number of non-missing
observations, n, it indicates the fraction of total variance explained by the model. It is estimated as follows:

\[
R^2_{adj} = 1 - (1 - R^2)\,\frac{n - 1}{n - p}
\]

Significance of the parameter values: Parameter estimates should be significantly different from zero (p < 0.05).
Homogeneity of the residuals: A plot of the residuals against the model-predicted values or the independent variables should show
a random pattern of constant variance around a residual value of zero (Clutter et al., 1983).
Distribution of residuals: Histograms of residuals were plotted to display the distribution patterns (normal or non-normal) of the
residuals.
Root Mean Squared Error (RMSE): It measures the accuracy of model predictions and is considered one of the most important
model evaluation criteria. RMSE was calculated using the following formula:

\[
RMSE = \sqrt{\frac{\sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2}{n - p}}
\]

where $Y_i$ and $\hat{Y}_i$ are the observed and predicted values, respectively; n is the total number of observations used to fit
the model; and p is the number of parameters.

Visual examination: The fitted curves were overlaid on scatter plots of the observed data and examined visually. This is the most
important part of modeling.
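
The numerical criteria above can be sketched in Python as follows. The adjusted R2 and RMSE use n - p in the denominator, as defined in this section; the AIC expression is the common least-squares form, n ln(SSE/n) + 2p, which is an assumption because the formula used in this study is not given. The function name and the sample values are hypothetical.

```python
import math


def fit_statistics(observed, predicted, n_params):
    """Return adjusted R2, RMSE and a least-squares AIC for one fitted model."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    sst = sum((y - mean_obs) ** 2 for y in observed)
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params)
    rmse = math.sqrt(sse / (n - n_params))
    # Assumed AIC form for least-squares fits; not confirmed by the source.
    aic = n * math.log(sse / n) + 2 * n_params
    return {"adj_r2": adj_r2, "rmse": rmse, "aic": aic}


# Hypothetical observed and predicted values for a two-parameter model.
stats = fit_statistics([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0], n_params=2)
print({name: round(value, 4) for name, value in stats.items()})
```

When comparing candidates, a higher adjusted R2 and lower RMSE and AIC indicate the better fit, provided the statistics are computed on the same response scale for every model.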


3. RESULTS AND DISCUSSION 


The number of sampled trees in each 10 cm DBH class, in ascending order of class, was 23, 49, 43, 16, 4, and 2, respectively
(Table 2). Across the 10 cm DBH classes from 5-15 cm to 55-65 cm, the average heights were in the ratio
1 : 1.36 : 1.6 : 1.72 : 1.78 : 1.62, respectively. This pattern is not linear, which is biologically logical for the change of height
with respect to DBH: the height growth rate (difference in height in relation to DBH) increased up to a specific DBH limit in the
early stages but decreased with growing DBH in the later stages.
Table 2 Descriptive statistics of the measured tree variables

Diameter class (cm) | No. of trees | DBH (cm): Avg ± S.E. | Min. | Max. | Height (m): Avg ± S.E. | Min. | Max. | Volume (m3): Avg ± S.E. | Min. | Max.
5-15 | 23 | 10.86 ± 0.51 | 7.1 | 14.6 | 11.4 ± 0.69 | 7.4 | 21.1 | 0.06 ± 0.008 | 0.018 | 0.14
15-25 | 49 | 20.06 ± 0.4 | 15.1 | 24.7 | 15.6 ± 0.38 | 10.2 | 20.6 | 0.25 ± 0.013 | 0.1 | 0.48
25-35 | 43 | 30.3 ± 0.9 | 25.1 | 34.8 | 18.32 ± 0.39 | 12.20 | 22.5 | 0.665 ± 0.024 | 0.39 | 0.98
35-45 | 16 | 39.04 ± 0.62 | 35.6 | 44.6 | 19.71 ± 0.22 | 18.3 | 21.3 | 1.19 ± 0.043 | 0.92 | 1.52
45-55 | 4 | 49.5 ± 1.84 | 46.6 | 54.3 | 20.4 ± 0.72 | 18.8 | 22.3 | 1.97 ± 0.2 | 1.71 | 2.58
55-65 | 2 | 62.15 ± 1.95 | 60.2 | 64.1 | 18.55 ± 2.37 | 16.2 | 20.9 | 2.79 ± 0.18 | 2.61 | 2.97
Pooled | 137 | 25.42 ± 0.93 | 7.1 | 64.1 | 16.41 ± 0.32 | 7.4 | 22.5 | 0.55 ± 0.044 | 0.018 | 2.97


Because a tree needs more strength to withstand external pressures such as wind, the thickening of its bole should be faster than
its height growth as the tree grows to larger and taller sizes (Khanna and Chaturvedi, 1994; Cato et al., 2006; Sharma, 2009). The
average DBH-to-height ratio of this species was found to be 1.54:1, and its average DBH-to-volume ratio was 46.38:1. The standard
errors of DBH and height are 6.22 and 4.77, respectively, while the standard errors of DBH and total volume are percentages of the
estimates of respective variables for this species (Kalle, 2001). The average DBH, height, and volume of the sampled trees were
25.42 cm, 16.411 m, and 0.54 m3, respectively, with standard errors of 0.93, 0.32, and 0.044, respectively.


Diameter-Height Relationship

The following table shows the estimated intercept and regression coefficient parameters, together with the fit statistics, namely the
adjusted coefficient of determination (adj. R2), root mean square error (RMSE), and AIC, for the candidate models M1 through M5.


Table 3 Model parameters with estimated values and fit statistics (n=137).

Model | Parameter a | Parameter b | Adj. R2 | RMSE | AIC
M1 | 10.441 | 0.235 | 0.468 | 1.2636 | 668.553
M2 | 3.19 | -9.203 | 0.664 | 1.011 | 20.89
M3 | 4.075 | 1.544 | 0.614 | 1.138 | -109.491
M4 | 0.9818 | 0.6245 | 0.704 | 96.225 | -858.097
M5 | 23.594 | -8.058 | 0.663 | 1.1162 | -128.052


The estimated values of all parameters were found to be statistically significant (p < 0.05) using the parametric t-test for
regression parameters, as shown in Table 3. Except for model M4, all models had a root mean square error (RMSE) of less
than two. As a result, M4 was eliminated from further study due to poor fit statistics, particularly RMSE. Models M1 and M3 were
eliminated because their adj. R2 was lower and their RMSE higher than those of M2 and M5. The best-fit models for the
height-diameter relationship were therefore M2 and M5, which had higher adj. R2, lower RMSE, and lower AIC.

Choosing the best model based solely on the values of adj. R2 and RMSE is not a wise decision. As a result, a graphical analysis
of residuals was also performed, alongside the AIC comparison. A residual, which may be thought of as the difference between the
data and the fit, is a measure of the variability not described by the regression model. The residuals are the realized or
observed values of the errors. As a result, any deviations from the errors' underlying assumptions should show up in the residuals.
The analysis of residuals is a useful tool for identifying many types of model flaws (Jayaraman, 2000). The histogram was used to
determine whether the residual distribution was nearly normal or non-normal, as well as which model had the best fit.
Figure 2 Histogram for residuals for M2 and M5.

Figure 2 shows that the residuals of both models M2 and M5 fell within a standard normal Z value of 4 (99% confidence range) and
that the residual histograms were roughly bell-shaped with symmetrical (normal) distributions. As a result, there was no
significant heteroscedasticity issue with the promising models (Sharma, 2009). However, model M2 showed a more normal
distribution of residuals than model M5, in that the normal curve of M2 covers more of the area of the histogram rectangles. In
the normal probability plot, the residuals of M2 clustered along the equal distribution line, whereas those of M5 did not
(Figure 3). It is typical to evaluate the residuals to see whether the data meet the assumptions required for the regression
analysis.
Figure 3 Normal P-P plot of standardized residuals of M2 and M5.
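
The residual checks discussed above (standardized residuals within a Z value of 4 and a normal P-P plot) can be sketched in Python as follows. This is an illustration under stated assumptions: residuals are standardized by their own sample standard deviation and the plotting position (i - 0.5)/n is used, both of which may differ from the definitions applied by the statistical software in this study. The function names and the data are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def standardized_residuals(observed, predicted):
    """Residuals centred on their mean and divided by their sample SD."""
    residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
    centre, spread = mean(residuals), stdev(residuals)
    return [(r - centre) / spread for r in residuals]


def pp_plot_points(std_residuals):
    """(observed, expected) cumulative probabilities for a normal P-P plot."""
    n = len(std_residuals)
    standard_normal = NormalDist()
    return [((i - 0.5) / n, standard_normal.cdf(z))
            for i, z in enumerate(sorted(std_residuals), start=1)]


# Hypothetical observed and predicted heights (m), for illustration only.
observed = [11.2, 14.8, 16.1, 18.4, 19.0, 20.3, 21.1, 18.9]
predicted = [11.9, 14.1, 16.6, 18.0, 19.4, 20.1, 20.6, 19.5]

z_scores = standardized_residuals(observed, predicted)
points = pp_plot_points(z_scores)

# True when every standardized residual lies within a Z value of 4.
print(all(abs(z) < 4 for z in z_scores))
# Largest departure of the P-P points from the equal distribution line.
print(max(abs(obs_p - exp_p) for obs_p, exp_p in points))
```

The closer the P-P points lie to the equal distribution line, the smaller the second printed value.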
If the residuals can be contained in a horizontal band on both sides of the zero-mean value of residuals, as in Figure 4, there
are no obvious model defects. This type of pattern was observed in both models M2 and M5; therefore, no decision could be made
from these scatter plots alone. The scatter plot of residuals versus the corresponding predicted (fitted) values is useful for
detecting several common types of model inadequacies (Jayaraman, 2000). From the available information, we considered model M2
the best-fitting model among the candidates.
Figure 4 Scatter plot of predicted vs. residual heights.
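
To show how the selected height-diameter model would be applied, the sketch below evaluates the Wykoff model with the M2 estimates from Table 3 (a = 3.19, b = -9.203). The breast-height value of 1.3 m is an assumption, since the value of Bh is not stated in this article, and the function name and example diameters are hypothetical.

```python
import math

A_HAT, B_HAT = 3.19, -9.203  # M2 (Wykoff) estimates reported in Table 3
BREAST_HEIGHT_M = 1.3        # assumption: the source does not state the value of Bh


def predict_height(dbh_cm, bh=BREAST_HEIGHT_M):
    """Predicted total height (m) from H = Bh + exp(a + b / (D + 1))."""
    return bh + math.exp(A_HAT + B_HAT / (dbh_cm + 1.0))


for d in (10, 25, 40, 60):
    print(d, round(predict_height(d), 2))
```

Because b is negative, the predicted height rises with DBH and levels off toward Bh + exp(3.19), about 25.6 m under the assumed Bh, which mirrors the flattening height growth at larger diameters described earlier.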


Diameter-Total Volume Relationship

The following table shows the estimated intercept and regression coefficient parameters, together with the fit statistics, namely the
adjusted coefficient of determination (adj. R2), root mean square error (RMSE), and AIC, for the candidate models M1 through M5.

Table 4 Model parameters with estimated values and fit statistics (n=137).

Model | Parameter a | Parameter b | Adj. R2 | RMSE | AIC
M1 | -0.613 | 0.046 | 0.907 | 0.5099 | -108.98
M2 | 5.057 | 2.435 | 0.98 | 1.1388 | -109.491
M3 | -2.424 | 0.948 | 0.703 | 0.2863 | 50.324
M4 | 10.234 | 0.097 | 0.87 | 1.089 | 150.608
M5 | -0.049 | 0.001 | 0.973 | 0.083 | -280.531


From the parametric t-test for regression parameters, Table 4 shows that the estimated values of all parameters were statistically
significant (p < 0.05). The root mean square error (RMSE) was more than 1 in two of the models (M2 and M4). As a result, M2 and
M4 were left out of the final analysis due to poor fit statistics, particularly RMSE. Models M3 and M4 were eliminated because they
had a lower adj. R2 and a greater RMSE than M1 and M5. M1 and M5 were the best-fit models for the total volume-diameter
relationship, with higher adj. R2, lower RMSE, and lower AIC. In addition, the RMSE and AIC of model M5 are considerably
lower than those of model M1, and the adjusted R2 of model M5 is higher than that of model M1, indicating that model M5 is better
suited to the diameter-volume relationship.

However, choosing the best model solely based on adj. R2 and RMSE is not a good decision. As a result, we also ran a graphical
analysis of the residuals, alongside the AIC comparison. A residual, which can be thought of as the difference between the data and
the fit, is a measure of the variability not explained by the regression model. The realized or observed values of the errors are
known as residuals. As a result, any deviations from the errors' underlying assumptions should be reflected in the residuals.
Residual analysis is a powerful tool for investigating a variety of model flaws (Jayaraman, 2000). The histogram was used to
determine whether the residual distribution was approximately normal or non-normal and which model best fit the data.
Figure 5 Histogram for residuals for M1 and M5.


Figure 5 illustrates that the residuals of models M1 and M5 fell within a standard normal Z value of 4 (99% confidence
interval) and that the residual histograms looked approximately bell-shaped with symmetrical (normal) distributions, except for
some outliers. This implied that there was no substantial heteroscedasticity problem with the promising models (Sharma, 2009).
Comparatively, model M5 showed a more normal distribution of residuals than model M1, in that the normal curve of M5 covers more
of the area of the histogram rectangles. In the normal probability plot, the residuals of M5 clustered along the equal distribution
line, whereas those of M1 did not (Figure 6). It is typical to evaluate the residuals to see whether the data meet the
assumptions required for the regression analysis.
Figure 6 Normal P-P plot of standardized residuals of M1 and M5.


If the residuals can be contained in a horizontal band on both sides of the zero-mean value of residuals, as in Figure 7, there
are no evident model flaws. Model M5 showed this pattern. The scatter plot of residuals versus the corresponding predicted
(fitted) values can be used to spot a variety of model flaws (Jayaraman, 2000). Based on the available information, we
determined that model M5 was the best-fitting model among the candidates.
Figure 7 Scatter plot of predicted vs. residual heights.
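
The selected volume model can be applied as in the sketch below, which uses the M5 estimates from Table 4 (a = -0.049, b = 0.001). The range check is an added assumption that reflects the site- and size-specific nature of the model: it restricts inputs to the DBH range of the measured trees (7.1 to 64.1 cm, Table 2). The function name and example diameters are hypothetical, and the returned volume is in the unit reported by this study.

```python
A_HAT, B_HAT = -0.049, 0.001        # M5 estimates reported in Table 4
DBH_MIN_CM, DBH_MAX_CM = 7.1, 64.1  # DBH range of the measured trees (Table 2)


def predict_volume(dbh_cm):
    """Predicted total volume from V = a + b * D^2, within the sampled DBH range."""
    if not DBH_MIN_CM <= dbh_cm <= DBH_MAX_CM:
        raise ValueError("DBH outside the range sampled in this study")
    return A_HAT + B_HAT * dbh_cm ** 2


for d in (10, 25, 40, 60):
    print(d, round(predict_volume(d), 3))
```

The guard also avoids a known artifact of the quadratic form: with these estimates, the predicted volume becomes negative for diameters below 7 cm.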


4. CONCLUSION 

Among the models tested in this study, the Wykoff model, H = Bh + exp(3.19 + (-9.203)/(D + 1)), where H is total height (m), Bh is
breast height (m), and D is diameter at breast height (cm), showed the best performance for the diameter-height relationship in
terms of statistically significant parameters, fit statistics, and graphical appearance, as did V = (-0.049) + 0.001 * D2 for the
diameter-total volume relationship. As a result, these models are suggested for predicting the height and volume of individual
Shorea robusta trees. Because these models are expressly site-specific, they should be applied only to sites, sizes, and stand
conditions similar to those examined in this study. Validation, verification, and re-calibration of the proposed models using
additional data from a broader range of sites, sizes, and stand conditions of Shorea robusta trees is also suggested. More
research is needed to establish a robust relationship between diameter and height, as well as between diameter and total volume,
by incorporating a larger amount of data from a wide range of distributions.